


MOTION ESTIMATOR/COMPENSATOR INCLUDING 
A 16-BIT 1/8 PEL INTERPOLATION FILTER 



The present invention relates to improved motion estimation and compensation. In 
particular, the present invention relates to an efficient 16-bit implementation of a 
1/8-pel interpolation filter for use in motion estimation and compensation. 

Motion pictures are being adopted in an increasing number of applications ranging 
from video telephony and video conferencing to DVD and digital television. When 
motion pictures are being transmitted, a substantial amount of data has to be sent 
through conventional transmission channels of a limited available frequency 
bandwidth. In order to transmit the digital data through the limited channel bandwidth, 
the volume of the transmission data has to be compressed or reduced. 

In order to enable inter-operability between systems designed by different 
manufacturers for any given application, video-coding standards have been developed 
for compressing the amount of video data. The coding approach underlying most of 
these standards consists of the following main stages: 

(1) Dividing each video frame into blocks of pixels such that a processing 
of the video frame can be conducted at a block level. 

(2) Reducing spatial redundancies within a video frame by subjecting 
video data of a block to a transformation, quantization and entropy 
coding. 

(3) Exploiting temporal dependencies between blocks of subsequent 
frames in order to only transmit changes between subsequent 
frames. 

Temporal dependencies between blocks of subsequent frames are determined by 
employing a motion estimation and compensation technique. For any given block, a 
search is performed in previously coded and transmitted frames to determine a 
motion vector, which will be used by the encoder and decoder to predict the image 
data of a block. 

An example of a video encoder configuration is illustrated in Fig. 1. The video 
encoder shown, generally denoted by reference numeral 100, comprises a transform unit 
120 to transform spatial image data to the frequency domain, a quantization unit 120 
to quantize the transform coefficients provided by the transform unit, a variable length 
coder 190 for entropy encoding the quantized transform coefficients and a video 
buffer (not shown) for adapting the compressed video data having a variable bit rate 
to a transmission channel which may have a fixed bit rate. 

The encoder shown in Fig. 1 employs DPCM (Differential Pulse Code Modulation), 
transmitting only the differences between subsequent fields or frames. These 
differences are obtained in subtracter 110, which receives the video data to be 
encoded and subtracts the previous image therefrom. The previous image is 
obtained by decoding the previously encoded image ("currently decoded image"). 
This is accomplished by a decoder, which is incorporated into video encoder 100. 
The decoder performs the encoding steps in a reverse manner, i.e. the decoder 
comprises an inverse quantizer 130, an inverse transform unit 130 and an adder 135 
for adding the decoded changes to the previously decoded image in order to produce 
the image as it will be obtained on the decoding side. 

In motion compensated DPCM, a current frame or field is predicted from image data 
of a previous frame or field based on an estimation of the motion between the current 
and the previous images. Such estimated motion may be described in terms of 
2-dimensional motion vectors representing the displacement of pixels between the 
previous and the current images. Usually, motion estimation is performed on a 
block-by-block basis. An example of the division of the current image into a plurality 
of blocks is illustrated in Fig. 2. 

During motion estimation, a block of a current frame is compared with blocks in 
previous frames until a best match is determined. Based on the comparison results, 
an inter-frame displacement vector for the whole block can be estimated for the 
current frame. For this purpose, a motion estimator 170 is incorporated into the 
encoder together with the corresponding motion compensator 160 included in the 
decoding path. 
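
The block comparison described above may be sketched as follows. This is a minimal illustration only, assuming an exhaustive integer-pel search and the sum of absolute differences (SAD) as the matching criterion; the description does not prescribe a particular criterion or search strategy, and the function names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of cur at (bx, by)
    and the block of ref displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Return (dx, dy, cost) of the best integer-pel match within +/- search_range."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that would read outside the reference frame.
            if bx + dx < 0 or bx + dx + n > width:
                continue
            if by + dy < 0 or by + dy + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

The returned displacement plays the role of the motion vector for the block; a sub-pixel search would additionally evaluate interpolated positions around this integer-pel result.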

The video encoder of Fig. 1 is operated as follows. A given video image of a video 
signal is divided into a number of small blocks, usually denoted as "macro blocks". 
For example, the video image shown in Fig. 2 is divided into a plurality of macro 
blocks, each of which usually has a size of 16x16 pixels. 

When encoding the video data of an image by only reducing spatial redundancies 
within the image, the resulting frame is referred to as an I-picture. I-pictures are 
typically encoded by directly applying the transform to the macro blocks of a frame. 
Encoded I-pictures are large in size as no temporal information is exploited to reduce 
the amount of data. 

In order to take advantage of temporal redundancies that exist between successive 
images, a prediction encoding between subsequent fields or frames is performed 
based on motion estimation and compensation. When the reference frame selected for 
motion estimation is a previously encoded frame, the frame to be encoded is referred 
to as a P-picture. When both a previously encoded frame and a future frame are 
chosen as reference frames, the frame to be encoded is referred to as a B-picture. 

Although the motion compensation has been described as being based on a 16x16 
macro block, motion estimation and compensation can be performed using a number 
of different block sizes. Individual motion vectors may be determined for blocks 
having 4x4, 4x8, 8x4, 8x8, 8x16, 16x8, or 16x16 pixels. The provision of small motion 
compensation blocks improves the ability to handle fine motion details. 

Based on the results of the motion estimation operation, the motion compensation 
operation provides a prediction based on the determined motion vector. The 
information contained in a prediction error block resulting from the predicted block is 
then transformed into transform coefficients in transform unit 120. Generally, a 
2-dimensional DCT (Discrete Cosine Transform) is employed. The resulting transform 
coefficients are quantized and finally entropy encoded (VLC) in entropy encoding unit 
190. 

The transmitted stream of compressed video data is received by a decoder, which 
reproduces a sequence of encoded video images based on the received data. The 
decoder configuration corresponds to that of the decoder included in the encoder 
shown in Fig. 1. A detailed description of a decoder configuration is therefore 
omitted. 

In order to improve the accuracy of motion compensation, a sub-pixel accuracy of 
reference frames is widely used. By employing a 1/8 sub-pixel motion vector 
accuracy, the coding efficiency can be significantly improved over a 1/2 or 1/4 
sub-pixel resolution. 

In order to further increase the motion vector accuracy and coding efficiency, a 1/3 
and a 1/6 sub-pixel vector accuracy have been proposed in EP-A-1 073 276. 

The motion vector accuracy and coding efficiency can further be increased by 
applying interpolation filters in motion estimation and compensation yielding 1/8 
sub-pixel displacements. However, such a sub-pixel resolution entails high 
computational complexity, in particular calculation registers having a length of up to 
25 bits. 

Such a complex implementation may be based on a 2-step approach. A first step 
calculates a 1/4 sub-pixel image employing an 8-tap filter. A second step calculates 
a 1/8 sub-pixel image from the 1/4 sub-pixel image by employing a bilinear filtering. 

The filtering operation for generating the image with the 1/4 sub-pixel accuracy 
comprises the steps of horizontal and subsequent vertical filtering. The horizontal 
interpolation may be performed based on the following equations: 

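$$h_1 = -3 \cdot A_4 + 12 \cdot B_4 - 37 \cdot C_4 + 229 \cdot D_4 + 71 \cdot E_4 - 21 \cdot F_4 + 6 \cdot G_4 - 1 \cdot H_4$$
$$h_2 = -3 \cdot A_4 + 12 \cdot B_4 - 39 \cdot C_4 + 158 \cdot D_4 + 158 \cdot E_4 - 39 \cdot F_4 + 12 \cdot G_4 - 3 \cdot H_4$$
$$h_3 = -1 \cdot A_4 + 6 \cdot B_4 - 21 \cdot C_4 + 71 \cdot D_4 + 229 \cdot E_4 - 37 \cdot F_4 + 12 \cdot G_4 - 3 \cdot H_4$$

In the above equations, $h_1$ to $h_3$ denote the 1/4 sub-pixel values and $A_x$ to $H_x$ 
(shown here for $x = 4$) represent the original full-pel pixel values, i.e. the pixels 
from the original image. 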

The horizontal filtering is illustrated in Fig. 3. An 8-tap filtering is performed based on 
original pixel values 210 and three intermediate pixel values 220 are calculated in 
order to obtain a 1/4 sub-pixel accuracy in horizontal direction. 

After the horizontal filtering has been completed, the resulting image data having a 
full-pel pixel accuracy in vertical direction and 1/4 sub-pixel accuracy in horizontal 
direction are subjected to vertical filtering. For this purpose, preferably equations 
having coefficients which correspond to those of the above described horizontal filter 
are employed. 



$$v_1 = -3 \cdot D_1 + 12 \cdot D_2 - 37 \cdot D_3 + 229 \cdot D_4 + 71 \cdot D_5 - 21 \cdot D_6 + 6 \cdot D_7 - 1 \cdot D_8$$
$$v_2 = -3 \cdot D_1 + 12 \cdot D_2 - 39 \cdot D_3 + 158 \cdot D_4 + 158 \cdot D_5 - 39 \cdot D_6 + 12 \cdot D_7 - 3 \cdot D_8$$
$$v_3 = -1 \cdot D_1 + 6 \cdot D_2 - 21 \cdot D_3 + 71 \cdot D_4 + 229 \cdot D_5 - 37 \cdot D_6 + 12 \cdot D_7 - 3 \cdot D_8$$

In the above equations, $v_1$ to $v_3$ refer to the calculated vertical 1/4 sub-pixel values and 
$D_1$, $D_2$, $D_3$, $D_4$, $D_5$, $D_6$, $D_7$ and $D_8$ represent the full-pel resolution pixels, i.e. the original 
pixels 210 and the intermediate pixels 220 obtained during horizontal filtering. 

The resulting pixel values have a length of up to 25 bits. In order to obtain image data 
wherein each of the pixel values falls into a predefined range of allowable pixel values, 
the calculation results are downshifted and rounded as illustrated, by way of example, 
for pixel value $v_1$: 



$v_1$ represents the pixel value resulting from the horizontal and vertical filtering, while 
$v_1'$ represents the downshifted pixel value. The downshifted pixel values are further 
clipped to a range of 0 to 255. 
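
The register length of up to 25 bits can be reproduced with a simple worst-case bound. The following Python sketch is an illustration only: it assumes 8-bit input pixels and that the vertical pass operates on the horizontal results without any intermediate downshift, and the helper names `output_range` and `stage_range` are hypothetical.

```python
# Prior-art 8-tap coefficients for the three 1/4-pel positions (each row sums to 256).
PRIOR_ART_TAPS = [
    [-3, 12, -37, 229, 71, -21, 6, -1],
    [-3, 12, -39, 158, 158, -39, 12, -3],
    [-1, 6, -21, 71, 229, -37, 12, -3],
]


def output_range(taps, lo, hi):
    """Smallest and largest value of sum(c * x) when every input x lies in [lo, hi]."""
    largest = sum(c * (hi if c > 0 else lo) for c in taps)
    smallest = sum(c * (lo if c > 0 else hi) for c in taps)
    return smallest, largest


def stage_range(lo, hi):
    """Range covered by all three filters for inputs in [lo, hi]."""
    ranges = [output_range(t, lo, hi) for t in PRIOR_ART_TAPS]
    return min(r[0] for r in ranges), max(r[1] for r in ranges)


h_lo, h_hi = stage_range(0, 255)       # after horizontal filtering of 8-bit pixels
v_lo, v_hi = stage_range(h_lo, h_hi)   # vertical filtering of the unshifted results
magnitude_bits = max(abs(v_lo), v_hi).bit_length()
```

Under these assumptions the largest horizontal result is 340 x 255 = 86700, and the bound after the vertical pass needs 25 bits for its magnitude, which is consistent with the register length stated above.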

The vertical filtering is illustrated in Fig. 4. The pixel values 230 obtained during 
vertical filtering complete the sub-pixel array illustrated by way of example between 
original pixels $D_4$, $D_5$, $E_4$ and $E_5$. Corresponding pixel values are calculated for the 
entire image although not shown in the drawings. 

After having the 1/4 sub-pixel image completed, a 1/8 sub-pixel frame is calculated 
by applying a bilinear filtering to the 1/4 sub-pixel resolution. In this manner, 
intermediate pixels are generated between each of the 1/4 resolution pixels. 

A bilinear filtering is applied in two steps and is illustrated by way of example in Fig. 5 
and Fig. 6. Starting from the 1/4 sub-pixel resolution, Fig. 5 illustrates the application 
of a horizontal and vertical filtering. For this purpose, a mean value is calculated from 
the respective neighbouring pixel values in order to obtain an intermediate pixel value 
of a 1/8 sub-pixel resolution. When employing a binary representation for this 
processing, the following equation can be applied: 

$$A = (B + C + 1) \gg 1$$

The remaining 1/8 sub-pixel values to be interpolated are calculated by diagonal 
filtering as illustrated in Fig. 6. It is a particular advantage of this approach that for the 
bilinear filtering sub-pixel values stemming from multiple filtering are avoided as far 
as possible. For this purpose, only those interpolated pixels are taken into account 
which are preferably directly derived from original pixel values 210, i.e. the 
interpolated pixel value located between original pixel values 210. 

All intermediate pixel values can be calculated therefrom, i.e. from the original pixel 
values 210 and the intermediate pixel values derived from the original pixel values, 
when additionally taking centre pixel 240 of the sub-pixel array into account. The 
calculation operation for the additional 1/8 sub-pixel values is based on two of the 1/4 
sub-pixel resolution values, respectively. The individual pixel values taken into 
account for the calculation of an intermediate pixel value are illustrated in Fig. 6 by 
respective arrows. Depending on the distance of the pixels to be taken into account 
for interpolation, the following two equations can be employed: 

$$D = (E + F + 1) \gg 1$$
$$G = (3H + I + 2) \gg 2$$

D and G represent new intermediate values as illustrated in Fig. 6 and E, F, H and I 
represent the pixel values from the 1/4 resolution image. The additional values of "1" 
and "2" in the above equations only serve for correctly rounding the calculation result. 
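
The two bilinear equations translate directly into integer arithmetic. The following Python lines are a minimal sketch; the function names are hypothetical, and the reading that H is the nearer of the two samples in the 3:1 weighting is an assumption drawn from the weights, not a statement of the description.

```python
def mean_round(e, f):
    """D = (E + F + 1) >> 1: rounded mean of two samples."""
    return (e + f + 1) >> 1


def weighted_round(h, i):
    """G = (3H + I + 2) >> 2: rounded 3:1 weighting (H assumed to be the nearer sample)."""
    return (3 * h + i + 2) >> 2
```

For example, `mean_round(10, 13)` evaluates (10 + 13 + 1) >> 1 and `weighted_round(10, 14)` evaluates (30 + 14 + 2) >> 2, the added constants rounding the result to the nearest integer before the shift.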

It is a particular disadvantage of such an interpolation approach that long registers 
are needed which result in high hardware complexity and computational effort. 

In view of this drawback, the present invention aims to provide a motion estimation 
and compensation method requiring a lower computational effort and lower hardware 
complexity. 

This is achieved by the features of the independent claims. 

According to a first aspect of the present invention, a method for estimating or 
compensating motion between images of a sequence of video images is provided. 
The method includes a step of interpolating pixel values. The interpolation first 
calculates intermediate pixel values of a 1/4 pixel resolution and subsequently, 
bilinear filters the pixel values of the 1/4 pixel resolution for determining additional 
pixel values in-between to obtain pixel values of a 1/8 pixel resolution. The 
calculation of intermediate pixel values of the 1/4 pixel resolution includes a first 
filtering step in a first direction for calculating intermediate pixel values based on pixel 
values of a current image in accordance with the following equations: 

$$h_1 = -1 \cdot A_h + 3 \cdot B_h - 10 \cdot C_h + 59 \cdot D_h + 18 \cdot E_h - 6 \cdot F_h + 1 \cdot G_h - 0 \cdot H_h,$$
$$h_2 = -1 \cdot A_h + 4 \cdot B_h - 10 \cdot C_h + 39 \cdot D_h + 39 \cdot E_h - 10 \cdot F_h + 4 \cdot G_h - 1 \cdot H_h,$$
$$h_3 = -0 \cdot A_h + 1 \cdot B_h - 6 \cdot C_h + 18 \cdot D_h + 59 \cdot E_h - 10 \cdot F_h + 3 \cdot G_h - 1 \cdot H_h$$

wherein $A_h$, $B_h$, $C_h$, $D_h$, $E_h$, $F_h$, $G_h$ and $H_h$ represent neighboring pixel values of the 
current image in the first direction of pixels and $h_1$, $h_2$, and $h_3$ represent the 
intermediate pixel values calculated between two of the neighbouring pixel values of 
the current image. The binary representations of the calculated intermediate pixel 
values are downshifted in a first shifting step by 6 bits. A second filtering step 
calculates intermediate pixel values based on pixel values obtained from the first 
filtering step and the first shifting step in accordance with the following equations: 



$$v_1 = -1 \cdot D_{v-3} + 3 \cdot D_{v-2} - 10 \cdot D_{v-1} + 59 \cdot D_v + 18 \cdot D_{v+1} - 6 \cdot D_{v+2} + 1 \cdot D_{v+3} - 0 \cdot D_{v+4},$$
$$v_2 = -1 \cdot D_{v-3} + 4 \cdot D_{v-2} - 10 \cdot D_{v-1} + 39 \cdot D_v + 39 \cdot D_{v+1} - 10 \cdot D_{v+2} + 4 \cdot D_{v+3} - 1 \cdot D_{v+4},$$
$$v_3 = -0 \cdot D_{v-3} + 1 \cdot D_{v-2} - 6 \cdot D_{v-1} + 18 \cdot D_v + 59 \cdot D_{v+1} - 10 \cdot D_{v+2} + 3 \cdot D_{v+3} - 1 \cdot D_{v+4}$$



wherein $D_{v-3}$, $D_{v-2}$, $D_{v-1}$, $D_v$, $D_{v+1}$, $D_{v+2}$, $D_{v+3}$ and $D_{v+4}$ represent neighboring pixel values in 
the second direction of pixels and $v_1$, $v_2$, and $v_3$ represent the intermediate pixel 
values calculated between two of the neighbouring pixel values. The binary 
representations of the intermediate pixel values calculated in the second filtering step 
are downshifted in a second shifting step by 6 bits. 

According to a further aspect of the present invention, a motion estimator or 
compensator for estimating or compensating motion between images of a sequence 
of video images is provided. The motion estimator includes a pixel interpolator for 
interpolating pixel values in a video image. The pixel interpolator includes a calculator 
for calculating intermediate pixel values of a 1/4 pixel resolution and a bilinear filter 
for filtering the pixel values of the 1/4 pixel resolution for determining additional 
pixel values in-between to obtain pixel values of a 1/8 pixel resolution. The calculator 
for the 1/4 pixel resolution includes a first filter, a first shifting unit, a second filter 
and a second shifting unit. The first filter calculates intermediate pixel values based 
on pixel values of a current image in a first direction in accordance with the following 
equations: 



$$h_1 = -1 \cdot A_h + 3 \cdot B_h - 10 \cdot C_h + 59 \cdot D_h + 18 \cdot E_h - 6 \cdot F_h + 1 \cdot G_h - 0 \cdot H_h,$$
$$h_2 = -1 \cdot A_h + 4 \cdot B_h - 10 \cdot C_h + 39 \cdot D_h + 39 \cdot E_h - 10 \cdot F_h + 4 \cdot G_h - 1 \cdot H_h,$$
$$h_3 = -0 \cdot A_h + 1 \cdot B_h - 6 \cdot C_h + 18 \cdot D_h + 59 \cdot E_h - 10 \cdot F_h + 3 \cdot G_h - 1 \cdot H_h$$



wherein $A_h$, $B_h$, $C_h$, $D_h$, $E_h$, $F_h$, $G_h$ and $H_h$ represent neighboring pixel values of the 
current image in the first direction of pixels and $h_1$, $h_2$, and $h_3$ represent the 
intermediate pixel values calculated between two of the neighbouring pixel values of 
the current image. The first shifting unit downshifts the binary representation of the 
pixel values from the first filter by 6 bits. The second filter calculates intermediate 
pixel values based on pixel values obtained from the first filter and the first shifting 
unit in a second direction in accordance with the following equations: 



$$v_1 = -1 \cdot D_{v-3} + 3 \cdot D_{v-2} - 10 \cdot D_{v-1} + 59 \cdot D_v + 18 \cdot D_{v+1} - 6 \cdot D_{v+2} + 1 \cdot D_{v+3} - 0 \cdot D_{v+4},$$
$$v_2 = -1 \cdot D_{v-3} + 4 \cdot D_{v-2} - 10 \cdot D_{v-1} + 39 \cdot D_v + 39 \cdot D_{v+1} - 10 \cdot D_{v+2} + 4 \cdot D_{v+3} - 1 \cdot D_{v+4},$$
$$v_3 = -0 \cdot D_{v-3} + 1 \cdot D_{v-2} - 6 \cdot D_{v-1} + 18 \cdot D_v + 59 \cdot D_{v+1} - 10 \cdot D_{v+2} + 3 \cdot D_{v+3} - 1 \cdot D_{v+4}$$



wherein $D_{v-3}$, $D_{v-2}$, $D_{v-1}$, $D_v$, $D_{v+1}$, $D_{v+2}$, $D_{v+3}$ and $D_{v+4}$ represent neighboring pixel values in 
the second direction of pixels and $v_1$, $v_2$, and $v_3$ represent the intermediate pixel 
values calculated between two of said neighbouring pixel values. The second shifting 
unit downshifts the binary representation of intermediate pixel values from the 
second filter by six bits. 

It is the particular approach of the present invention to provide a 1/8 pixel resolution 
by only employing 16-bit register means. By selecting appropriate coefficients for the 
horizontal and vertical filtering for the 1/4 sub-pixel calculation and applying a 6-bit 
downshift in-between, a 1/8 sub-pixel accuracy can be achieved without complex 
computation. During the interpolation processing, no intermediate calculation 
requires a register wider than 16 bits. Consequently, motion estimation and 
compensation can be improved by employing an increased pixel accuracy without 
increasing the computational effort accordingly. 

Based on the improved motion estimation and compensation, the encoding and 
decoding efficiency of images can be improved in a corresponding manner without 
increasing the hardware and computational complexity accordingly. 

Preferred embodiments of the present invention are the subject matter of the 
dependent claims. 

Other embodiments and advantages of the present invention will become more 
apparent from the following description of the preferred embodiment in which: 

Fig. 1 illustrates, in block diagram form, the configuration of a video encoder, 

Fig. 2 illustrates the division of a video image into a plurality of blocks, 

Fig. 3 illustrates the horizontal filtering for obtaining a 1/4 sub-pixel accuracy 
in horizontal direction, 

Fig. 4 illustrates a vertical filtering for obtaining a 1/4 sub-pixel accuracy in 
vertical direction, 

Fig. 5 illustrates a horizontal and vertical bilinear filtering for obtaining a 1/8 
sub-pixel accuracy, 

Fig. 6 illustrates a bilinear filtering in diagonal direction for obtaining a 1/8 
sub-pixel accuracy, 

Fig. 7 illustrates the coding results of the present invention compared to 
conventional approaches for a first example image, and 

Fig. 8 illustrates the coding results of the present invention compared to 
conventional approaches for a second example image. 



In video encoding, the coding efficiency is increased by applying motion estimation 
and motion compensation in predictive coding. The estimation and compensation of 
motion can be improved by reducing the difference remaining between the image 
data to be encoded and the predictive image data. In particular, a 1/8 sub-pixel 
motion vector accuracy can further improve the coding efficiency. 

The present invention achieves such an improved motion estimation and 
compensation without a corresponding increase in hardware complexity and 
computational effort, as the present invention requires only a 16-bit accuracy 
of intermediate calculation results for this purpose. 

A two-step procedure is employed for obtaining the 1/8 pixel accuracy. In a first 
stage, a horizontal and a vertical filtering are applied in succession. For interpolating 
1/4 sub-pixel values in horizontal direction, the following equations are applied: 

$$h_1 = -1 \cdot A_h + 3 \cdot B_h - 10 \cdot C_h + 59 \cdot D_h + 18 \cdot E_h - 6 \cdot F_h + 1 \cdot G_h - 0 \cdot H_h,$$
$$h_2 = -1 \cdot A_h + 4 \cdot B_h - 10 \cdot C_h + 39 \cdot D_h + 39 \cdot E_h - 10 \cdot F_h + 4 \cdot G_h - 1 \cdot H_h,$$
$$h_3 = -0 \cdot A_h + 1 \cdot B_h - 6 \cdot C_h + 18 \cdot D_h + 59 \cdot E_h - 10 \cdot F_h + 3 \cdot G_h - 1 \cdot H_h$$

In the above equations, $h_1$ to $h_3$ represent the 1/4 sub-pixel values to be interpolated and 
$A_h$ to $H_h$ represent the original full-pel pixel values. 

After completing the horizontal filtering, the calculated values are downshifted. This 
is illustrated in the following equation, by way of example, for the intermediate value 
$h_1$: 

$$h_1' = \left(h_1 + \frac{2^6}{2}\right) \gg 6$$

$h_1$ represents the interpolated pixel value resulting from horizontal filtering and $h_1'$ 
represents the respectively downshifted pixel value. A corresponding processing is 
applied to all of the interpolated pixel values resulting from horizontal filtering. 
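
The horizontal step may be sketched in Python as follows. The sketch is illustrative only: the function name `filter_row`, the mapping of the taps A to H onto the positions x-3 to x+4, and the replication of edge pixels at the row borders are assumptions, and the rounding offset of 32 before the 6-bit downshift follows the rounding rule used elsewhere in this description.

```python
# 8-tap coefficients for the quarter-pel offsets 1/4, 2/4 and 3/4 (each set sums to 64).
QUARTER_PEL_TAPS = {
    1: (-1, 3, -10, 59, 18, -6, 1, 0),
    2: (-1, 4, -10, 39, 39, -10, 4, -1),
    3: (0, 1, -6, 18, 59, -10, 3, -1),
}


def filter_row(row, x, frac):
    """Quarter-pel sample between row[x] and row[x + 1] at offset frac/4 (frac in 1..3)."""
    last = len(row) - 1
    acc = 0
    for k, coeff in enumerate(QUARTER_PEL_TAPS[frac]):
        idx = min(max(x - 3 + k, 0), last)  # assumed edge replication at the borders
        acc += coeff * row[idx]
    # Round by adding 2**6 / 2 = 32, then downshift by 6 bits.
    return (acc + 32) >> 6
```

Because every coefficient set sums to 64 = 2^6, a flat row is reproduced unchanged by the filter and the subsequent 6-bit downshift.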

In a second step of the first stage, the horizontally increased sub-pixel accuracy is 
also obtained in vertical direction. For this purpose, a vertical filtering is applied. The 
previously performed downshift operation ensures that none of the intermediate 
calculations exceeds a 16-bit accuracy in the vertical filtering step. The vertical 
filtering is performed by employing the filter coefficients as shown in the following 
equations which correspond to those of the horizontal filtering: 

$$v_1 = -1 \cdot D_{v-3} + 3 \cdot D_{v-2} - 10 \cdot D_{v-1} + 59 \cdot D_v + 18 \cdot D_{v+1} - 6 \cdot D_{v+2} + 1 \cdot D_{v+3} - 0 \cdot D_{v+4},$$
$$v_2 = -1 \cdot D_{v-3} + 4 \cdot D_{v-2} - 10 \cdot D_{v-1} + 39 \cdot D_v + 39 \cdot D_{v+1} - 10 \cdot D_{v+2} + 4 \cdot D_{v+3} - 1 \cdot D_{v+4},$$
$$v_3 = -0 \cdot D_{v-3} + 1 \cdot D_{v-2} - 6 \cdot D_{v-1} + 18 \cdot D_v + 59 \cdot D_{v+1} - 10 \cdot D_{v+2} + 3 \cdot D_{v+3} - 1 \cdot D_{v+4}$$

$v_1$ to $v_3$ refer to the vertical 1/4 sub-pixel values and $D_{v-3}$, $D_{v-2}$, $D_{v-1}$, $D_v$, $D_{v+1}$, $D_{v+2}$, $D_{v+3}$ and 
$D_{v+4}$ represent the full-pel pixel positions in vertical direction, i.e. pixels 210 and 220 
from Fig. 3. 



The calculation results from the vertical filtering, namely, pixel values 230, are 
subjected to downshifting by applying the following equation, which is illustrated by 
way of example for $v_1$ only: 

$$v_1' = \left(v_1 + \frac{2^6}{2}\right) \gg 6$$

A rounding during the downshift operation is achieved by adding the value 
$2^6/2 = 64/2$ to the interpolated pixel value. 

Although the above description first applies a horizontal filtering and then a vertical 
filtering together with respective downshift operations, the skilled person is aware 
that the horizontal and vertical operations may be exchanged to achieve the same 
result. Thus, the vertical filtering may be performed before a horizontal filtering is 
applied. 

The finally obtained sub-pixel values of a 1/4 sub-pixel accuracy are clipped in order 
to be in a range between 0 and 255. 
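
The complete first stage for a single sample position may be sketched as follows. This Python sketch is an illustration under stated assumptions: the full-pel offset is written as a pass-through tap set, edge pixels are replicated at the image borders, clipping is applied after both downshift operations, and the names `quarter_pel`, `filter_and_shift` and `clip8` are hypothetical.

```python
# Taps indexed by the quarter-pel offset 0..3. Offset 0 (full-pel) is written
# as a pass-through filter; this is a convenience assumed for the sketch.
TAPS = {
    0: (0, 0, 0, 64, 0, 0, 0, 0),
    1: (-1, 3, -10, 59, 18, -6, 1, 0),
    2: (-1, 4, -10, 39, 39, -10, 4, -1),
    3: (0, 1, -6, 18, 59, -10, 3, -1),
}


def clip8(value):
    """Clip to the allowable 8-bit pixel range 0..255."""
    return max(0, min(255, value))


def filter_and_shift(samples, frac):
    """Apply the 8-tap filter for offset frac/4, then round and downshift by 6 bits."""
    acc = sum(c * s for c, s in zip(TAPS[frac], samples))
    return (acc + 32) >> 6


def quarter_pel(img, x, y, fx, fy):
    """Sample at (x + fx/4, y + fy/4); img is a list of rows of 8-bit values."""
    height, width = len(img), len(img[0])
    column = []
    for dy in range(-3, 5):
        row = img[min(max(y + dy, 0), height - 1)]  # assumed edge replication
        taps_in = [row[min(max(x + dx, 0), width - 1)] for dx in range(-3, 5)]
        column.append(clip8(filter_and_shift(taps_in, fx)))  # horizontal pass
    return clip8(filter_and_shift(column, fy))  # vertical pass
```

Calling `quarter_pel(img, x, y, 0, 0)` returns the original pixel, while the remaining fifteen combinations of `fx` and `fy` yield the interpolated 1/4 sub-pixel values that the bilinear stage subsequently refines.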

The obtained 1/4 sub-pixel values are subjected to a bilinear filtering as it has been 
described above in connection with Fig. 5 and Fig. 6 in order to obtain a 1/8 sub-pixel 
resolution. 



The following example demonstrates that the processing of the present invention 
does not require any registers for intermediate pixel values exceeding a 16-bit 
accuracy. 

Assuming a pixel value range between 0 and 255, the largest possible values during 
a horizontal 8-tap filtering may occur when employing the equation for calculating 
intermediate pixel value $h_2$: 

$$h_2 = -1 \cdot 0 + 4 \cdot 255 + (-10) \cdot 0 + 39 \cdot 255 + 39 \cdot 255 + (-10) \cdot 0 + 4 \cdot 255 + (-1) \cdot 0$$
$$h_2 = 21930 < 32768 = 2^{15} \Rightarrow 15\ \text{bits} + 1\ \text{bit (sign)}$$

The resulting pixel value is downshifted as indicated by the following equation: 

$$\left(21930 + \frac{64}{2}\right) \gg 6 = 343$$

The result of the downshift operation is clipped to the range of 0 to 255. 

As demonstrated above, the required pixel accuracy for the largest possible values 
during the filtering operation does not exceed 16 bits. Although the above example 
has only been calculated for the horizontal direction, corresponding coefficients are 
used for the vertical filtering and, thus, identical advantages are achieved. 

The above example only relates to the 1/4 sub-pixel resolution calculation. The 
bilinear filtering for generating a 1/8 sub-pixel resolution only requires a maximum 
accuracy of 10 bits. Thus, a maximum accuracy of 16 bits is sufficient for performing 
all calculations of the present invention. Accordingly, the motion estimation and 
compensation and the encoding and decoding of video data can be improved in a 
simple manner. 
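
The bounds stated above can be checked numerically. The following Python sketch is illustrative only; the helper names are hypothetical, and the second-pass bound is computed under the more pessimistic assumption that the first-pass results are not clipped before being filtered again.

```python
TAPS_16BIT = [
    (-1, 3, -10, 59, 18, -6, 1, 0),
    (-1, 4, -10, 39, 39, -10, 4, -1),
    (0, 1, -6, 18, 59, -10, 3, -1),
]


def accumulator_range(lo, hi):
    """Extreme filter sums over all three tap sets for inputs in [lo, hi]."""
    smallest = min(sum(c * (lo if c > 0 else hi) for c in t) for t in TAPS_16BIT)
    largest = max(sum(c * (hi if c > 0 else lo) for c in t) for t in TAPS_16BIT)
    return smallest, largest


def fits_int16(lo, hi):
    """True if the sums, including the rounding offset of 32, fit a signed 16-bit register."""
    return -2**15 <= lo and hi + 32 <= 2**15 - 1


first_lo, first_hi = accumulator_range(0, 255)
# Second pass without intermediate clipping: inputs are the shifted first-pass values.
second_lo, second_hi = accumulator_range((first_lo + 32) >> 6, (first_hi + 32) >> 6)
# Largest intermediate sum of the bilinear stage, G = (3H + I + 2) >> 2.
bilinear_max = 3 * 255 + 255 + 2
```

Under these assumptions the first pass stays within -5610 to 21930 and the second pass within -15114 to 31434, both inside the signed 16-bit range, while the bilinear sum of at most 1022 needs 10 bits.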

To demonstrate that results similar to those of conventional interpolation 
implementations can be achieved when applying the present invention, the algorithm 
of the present invention has been implemented in the H.264/MPEG encoder 
processing software (JM4.2). The calculation results are illustrated in Fig. 7 and 
Fig. 8 by rate distortion curves which indicate the impact on the perceived picture 
quality. The two figures differ only in the example image sequence employed. 

The rate distortion curves of Fig. 7 and Fig. 8 are shown over the bit rate on the x- 
axis and the peak signal to noise ratio (PSNR) on the y-axis representing a measure 
for the introduced distortions. 
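
For reference, the PSNR of a decoded image relative to the original is commonly computed as sketched below. The Python function is an illustration of the standard definition for 8-bit samples (peak value 255); it is not taken from the encoder software used for the measurements, and the name `psnr` is hypothetical.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized images given as lists of rows."""
    count = 0
    squared_error = 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical images: no distortion
    mse = squared_error / count
    return 10.0 * math.log10(peak * peak / mse)
```

A higher PSNR at a given bit rate corresponds to less distortion, so a curve lying further towards the upper left of such a diagram indicates the more efficient coder.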

Fig. 7 and Fig. 8 demonstrate that the 16-bit implementation of a 1/8 sub-pixel filter 
(1/8-pel 16 bit) does not result in an image quality degradation compared to the 
JM4.2 algorithm (1/8-pel 25 bit) although the JM4.2 algorithm requires longer 
registers. In addition, the approach of the present invention actually performs better 
than a 1/4 sub-pixel 20-bit encoder (1/4-pel 20 bit). 

In summary, the present invention provides improved motion estimation and 
compensation while employing only a simplified hardware configuration and less 
computational effort. This is achieved by particular filter coefficients and additional 
downshift operations when calculating a 1/4 sub-pixel resolution image. Accordingly, 
a more efficient encoding and decoding with a simpler hardware configuration 
can be achieved. 

EPO - Munich 

13 May 2004 

CLAIMS 



1. A method for estimating or compensating motion between images of a 
sequence of video images, said method including a step of interpolating pixel 
values, said interpolation step comprising the steps of: 

calculating intermediate pixel values (220, 230) of a 1/4 pixel resolution, and 

bilinear filtering the pixel values of said 1/4 pixel resolution for determining 
additional pixel values (A) in-between to obtain pixel values of a 1/8 pixel 
resolution 

characterized in that the calculation of intermediate pixel values of said 1/4 
pixel resolution includes 

a first filtering step in a first direction for calculating intermediate pixel values 
(220) based on pixel values (210) of a current image in accordance with the 
following equations: 

$$h_1 = -1 \cdot A_h + 3 \cdot B_h - 10 \cdot C_h + 59 \cdot D_h + 18 \cdot E_h - 6 \cdot F_h + 1 \cdot G_h - 0 \cdot H_h,$$
$$h_2 = -1 \cdot A_h + 4 \cdot B_h - 10 \cdot C_h + 39 \cdot D_h + 39 \cdot E_h - 10 \cdot F_h + 4 \cdot G_h - 1 \cdot H_h,$$
$$h_3 = -0 \cdot A_h + 1 \cdot B_h - 6 \cdot C_h + 18 \cdot D_h + 59 \cdot E_h - 10 \cdot F_h + 3 \cdot G_h - 1 \cdot H_h$$

wherein $A_h$, $B_h$, $C_h$, $D_h$, $E_h$, $F_h$, $G_h$ and $H_h$ represent neighboring pixel values 
(210) of the current image in said first direction of pixels and $h_1$, $h_2$, and $h_3$ 
represent the intermediate pixel values (220) calculated between two of said 
neighbouring pixel values (210) of the current image, 

a first shifting step for downshifting the binary representation of said calculated 
intermediate pixel values (220) by six bits, 

a second filtering step in a second direction for calculating intermediate pixel 
values (230) based on pixel values obtained from said first filtering step and 
said first shifting step in accordance with the following equations: 

$$v_1 = -1 \cdot D_{v-3} + 3 \cdot D_{v-2} - 10 \cdot D_{v-1} + 59 \cdot D_v + 18 \cdot D_{v+1} - 6 \cdot D_{v+2} + 1 \cdot D_{v+3} - 0 \cdot D_{v+4},$$
$$v_2 = -1 \cdot D_{v-3} + 4 \cdot D_{v-2} - 10 \cdot D_{v-1} + 39 \cdot D_v + 39 \cdot D_{v+1} - 10 \cdot D_{v+2} + 4 \cdot D_{v+3} - 1 \cdot D_{v+4},$$
$$v_3 = -0 \cdot D_{v-3} + 1 \cdot D_{v-2} - 6 \cdot D_{v-1} + 18 \cdot D_v + 59 \cdot D_{v+1} - 10 \cdot D_{v+2} + 3 \cdot D_{v+3} - 1 \cdot D_{v+4}$$

wherein $D_{v-3}$, $D_{v-2}$, $D_{v-1}$, $D_v$, $D_{v+1}$, $D_{v+2}$, $D_{v+3}$ and $D_{v+4}$ represent neighboring pixel 
values in said second direction of pixels and $v_1$, $v_2$, and $v_3$ represent the 
intermediate pixel values (230) calculated between two of said neighbouring 
pixel values, and 

a second shifting step for downshifting the binary representation of said 
intermediate pixel values (230) calculated in said second filtering step by six 
bits. 

2. A method according to claim 1, wherein said first and said second directions of 
pixels are the horizontal and vertical direction. 

3. A method according to claim 1 or 2, further comprising the step of clipping 
each calculated intermediate pixel value to a predefined range of allowable 
pixel values after said first and/or second shifting step. 

4. A method according to claim 3, wherein the pixels of the image have a 
resolution of 8 bits and an allowable pixel value range from 0 to 255. 

5. A method according to any of claims 1 to 4, wherein said bilinear filtering step 
calculates intermediate pixel values (A) by applying a mean value calculation. 

6. A method according to any of claims 1 to 5, wherein said bilinear filtering is 
applied in horizontal and vertical direction. 

7. A method according to any of claims 1 to 5, wherein said bilinear filtering is 
applied in diagonal directions. 

8. A method according to claim 7, wherein said bilinear filtering is based on pixel 
values (E, F, H, I) which are directly derived from the pixel values (210) of the 
original image. 

9. A method according to claim 8, wherein said bilinear filtering is additionally 
based on the centre pixel value (240) between four pixel values (210) of the 
original image. 


10. A method according to any of claims 7 to 9, wherein said bilinear filtering 
applies the following equations depending on the two pixel values (E, F; H, I) 
to be taken into account: 






$$D = (E + F + 1) \gg 1$$
$$G = (3H + I + 2) \gg 2$$

wherein D and G represent said bilinear filtered pixel values, E, F, H and I 
represent pixel values taken into account for said bilinear filtering, and $\gg$ 
represents a binary downshift operation. 



11. A method for encoding a sequence of video images employing motion 
estimation in accordance with any of claims 1 to 10. 



12. A method for decoding a sequence of encoded video images employing 
motion compensation in accordance with any of claims 1 to 10. 



13. A method for decoding a sequence of encoded video images, the decoding 
method including the step of calculating image data of a 1/8 sub-pixel 
resolution wherein the image data of a 1/8 sub-pixel resolution is calculated in 
accordance with the method of any of claims 1 to 10. 



14. A motion estimator or compensator for estimating or compensating motion 
between images of a sequence of video images, said motion estimator 
including a pixel interpolator for interpolating pixel values in a video image, 
comprising: 

a calculator for calculating intermediate pixel values (220, 230) of a 1/4 pixel 
resolution, and 

a bilinear filter for filtering the pixel values of said 1/4 pixel resolution for 
determining additional pixel values (A) in-between to obtain pixel values of a 
1/8 pixel resolution, 

characterized in that said calculator for the 1/4 pixel resolution includes 

a first filter for calculating intermediate pixel values (220) based on pixel 
values (210) of a current image in a first direction in accordance with the 
following equations: 

$$\begin{aligned}
h_1 &= -1 \cdot A_h + 3 \cdot B_h - 10 \cdot C_h + 59 \cdot D_h + 18 \cdot E_h - 6 \cdot F_h + 1 \cdot G_h - 0 \cdot H_h, \\
h_2 &= -1 \cdot A_h + 4 \cdot B_h - 10 \cdot C_h + 39 \cdot D_h + 39 \cdot E_h - 10 \cdot F_h + 4 \cdot G_h - 1 \cdot H_h, \\
h_3 &= -0 \cdot A_h + 1 \cdot B_h - 6 \cdot C_h + 18 \cdot D_h + 59 \cdot E_h - 10 \cdot F_h + 3 \cdot G_h - 1 \cdot H_h
\end{aligned}$$

wherein $A_h$, $B_h$, $C_h$, $D_h$, $E_h$, $F_h$, $G_h$ and $H_h$ represent neighboring pixel values 
(210) of the current image in said first direction of pixels and $h_1$, $h_2$, and $h_3$ 
represent the intermediate pixel values (220) calculated between two of said 
neighbouring pixel values (210) of the current image, 

a first shifting unit for downshifting the binary representation of the pixel values 
(220) from said first filter by six bits, 

a second filter for calculating intermediate pixel values (230) based on pixel 
values obtained from said first filter and said first shifting unit in a second 
direction in accordance with the following equations: 

$$\begin{aligned}
v_1 &= -1 \cdot D_{v-3} + 3 \cdot D_{v-2} - 10 \cdot D_{v-1} + 59 \cdot D_v + 18 \cdot D_{v+1} - 6 \cdot D_{v+2} + 1 \cdot D_{v+3} - 0 \cdot D_{v+4}, \\
v_2 &= -1 \cdot D_{v-3} + 4 \cdot D_{v-2} - 10 \cdot D_{v-1} + 39 \cdot D_v + 39 \cdot D_{v+1} - 10 \cdot D_{v+2} + 4 \cdot D_{v+3} - 1 \cdot D_{v+4}, \\
v_3 &= -0 \cdot D_{v-3} + 1 \cdot D_{v-2} - 6 \cdot D_{v-1} + 18 \cdot D_v + 59 \cdot D_{v+1} - 10 \cdot D_{v+2} + 3 \cdot D_{v+3} - 1 \cdot D_{v+4}
\end{aligned}$$

wherein $D_{v-3}$, $D_{v-2}$, $D_{v-1}$, $D_v$, $D_{v+1}$, $D_{v+2}$, $D_{v+3}$ and $D_{v+4}$ represent neighboring pixel 
values in said second direction of pixels and $v_1$, $v_2$, and $v_3$ represent the 
intermediate pixel values (230) calculated between two of said neighbouring 
pixel values, and 

a second shifting unit for downshifting the binary representation of said 
intermediate pixel values (230) from said second filter by six bits. 
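
The two filtering stages of claim 14 can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `TAPS`, `filter_8tap` and `interpolate_2d` are hypothetical, the first direction is taken to be horizontal and the second vertical, and no rounding offset is added before the downshift because the claim does not recite one. Each coefficient set sums to 64, which is why a downshift by six bits restores the original scale.

```python
# Coefficient sets as listed in the claim; each row sums to 64 (= 1 << 6).
TAPS = {
    1: (-1, 3, -10, 59, 18, -6, 1, 0),
    2: (-1, 4, -10, 39, 39, -10, 4, -1),
    3: (0, 1, -6, 18, 59, -10, 3, -1),
}
SHIFT = 6


def filter_8tap(samples, phase):
    """Apply the 8-tap filter for quarter-pel position `phase` (1, 2 or 3)
    to eight neighbouring samples and downshift the sum by six bits."""
    if len(samples) != 8:
        raise ValueError("expected exactly 8 samples")
    acc = sum(c * s for c, s in zip(TAPS[phase], samples))
    return acc >> SHIFT  # arithmetic shift, no rounding offset


def interpolate_2d(block, h_phase, v_phase):
    """`block` is an 8x8 list of rows of integer-pel samples.
    Each row is filtered horizontally, then the resulting column of
    eight intermediate values is filtered vertically."""
    column = [filter_8tap(row, h_phase) for row in block]
    return filter_8tap(column, v_phase)


flat = [[100] * 8 for _ in range(8)]
print(interpolate_2d(flat, 2, 1))  # 100
```

A flat area is reproduced unchanged. A sharp edge such as `[0, 0, 0, 255, 255, 0, 0, 0]` yields 310 at position 2, above the 8-bit maximum, which shows why an additional clipping of the shifted values can be useful.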



15. A motion estimator or compensator according to claim 14, wherein said first 
and said second directions of pixels are the horizontal and vertical directions. 



16. A motion estimator or compensator according to claim 14 or 15, further 
comprising clipping means for clipping the pixel values from said first and said 
second shifting units to a predefined range of allowable pixel values. 

17. A motion estimator or compensator according to claim 16, wherein the pixels 
of the image have a resolution of 8 bits and an allowable pixel value range 
from 0 to 255. 
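
The clipping and the 16-bit budget named in the title can be illustrated by a short range check in Python. The sketch assumes 8-bit input pixels (0 to 255) and the coefficient sets of claim 14. The names `clip` and `sum_range` are hypothetical. It computes the extreme filter sums before the downshift for the first filter, and for the second filter both without and with clipping of the shifted first-stage values.

```python
INT16_MIN, INT16_MAX = -32768, 32767

# Coefficient sets of the 1/4-pel filters (assumed as in claim 14).
TAPS = (
    (-1, 3, -10, 59, 18, -6, 1, 0),
    (-1, 4, -10, 39, 39, -10, 4, -1),
    (0, 1, -6, 18, 59, -10, 3, -1),
)


def clip(value, lo=0, hi=255):
    """Clip a value to the allowable pixel range."""
    return max(lo, min(hi, value))


def sum_range(lo, hi):
    """Smallest and largest filter sum (before the downshift) over all
    three coefficient sets when every input sample lies in [lo, hi]."""
    smallest = min(sum(c * (hi if c < 0 else lo) for c in t) for t in TAPS)
    largest = max(sum(c * (hi if c > 0 else lo) for c in t) for t in TAPS)
    return smallest, largest


first = sum_range(0, 255)                            # first filter
unclipped = sum_range(first[0] >> 6, first[1] >> 6)  # second filter, no clipping
clipped = sum_range(clip(first[0] >> 6), clip(first[1] >> 6))  # with clipping

for lo, hi in (first, unclipped, clipped):
    assert INT16_MIN <= lo and hi <= INT16_MAX

print(first, unclipped, clipped)
# (-5610, 21930) (-15092, 31348) (-5610, 21930)
```

Under these assumptions every intermediate sum fits into a signed 16-bit word, and clipping after the first downshift leaves the second stage with the same range as the first.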



18. A motion estimator or compensator according to any of claims 14 to 17, 
wherein said bilinear filter comprises a mean value calculator for calculating 
intermediate pixel values (A). 



19. A motion estimator or compensator according to any of claims 14 to 18, 
wherein said bilinear filter filters pixels in the horizontal and vertical directions. 

20. A motion estimator or compensator according to any of claims 14 to 19, 
wherein said bilinear filter filters pixels in diagonal directions. 

21. A motion estimator or compensator according to claim 20, wherein said 
bilinear filter filters pixel values (E, F, H, I) which are directly derived from 
the pixel values (210) of the original image. 

22. A motion estimator or compensator according to claim 21, wherein said 
bilinear filter additionally takes the centre pixel value (240) between four pixel 
values (210) of the original image into account. 

23. A motion estimator or compensator according to any of claims 20 to 22, 
wherein said bilinear filter applies the following equations depending on the 
two pixel values (E, F; H, I) to be taken into account: 

$$D = (E + F + 1) \gg 1,$$
$$G = (3H + I + 2) \gg 2$$

wherein $D$ and $G$ represent said bilinear filtered pixel values, $E$, $F$, $H$ and $I$ 
represent pixel values taken into account for said bilinear filtering, and $\gg$ 
represents a binary downshift operation. 

24. An encoder for encoding a sequence of video images comprising a motion 
estimator in accordance with any of claims 14 to 23. 

25. A decoder for decoding a sequence of encoded video images employing a 
motion compensator in accordance with any of claims 14 to 23. 

The present invention provides improved motion estimation that requires only a 
simplified hardware configuration and less computational effort. This is achieved by 
particular filter coefficients and additional downshift operations when calculating a 1/4 
sub-pixel resolution image. Accordingly, more efficient encoding and decoding can be 
achieved with a simpler configuration. 